Hotel Booking¶

Hotel booking demand datasets (Data in Brief, 2019)¶

Description¶

Context¶

This dataset contains 119,390 observations for a City Hotel and a Resort Hotel. Each observation represents a hotel booking between 1 July 2015 and 31 August 2017, including bookings that effectively arrived and bookings that were canceled.

Content¶

Since this is real hotel data, all data elements pertaining to hotel or customer identification were deleted. Four columns, 'name', 'email', 'phone number' and 'credit_card', have been artificially created and added to the dataset.

Acknowledgements¶

The data is originally from the article Hotel Booking Demand Datasets, written by Nuno Antonio, Ana Almeida, and Luis Nunes for Data in Brief, Volume 22, February 2019.

Predicting cancellations¶

It would be useful for hotels to have a model that predicts whether a guest will actually arrive.
This can help a hotel plan things like personnel and food requirements.
Maybe some hotels also use such a model to overbook, that is, to offer more rooms than they have in order to make more money.

Outlines¶

  • Import libraries & data
  • Read and know the data
  • Assessing data
  • Cleaning data
  • Analyzing and visualizing data
  • Conclusions
  • References

Observation¶

This is a general project to explore and analyze the data without a single major problem to solve, so I worked through most of the variables and the relationships between them.

Analyzing and Visualizing Data¶

1. Univariate Exploration¶

Here I will explore the data by answering questions such as:

Questions¶

  1. Is there a relationship between having babies and the cancellation rate? (Statistically)
  2. Is the cancellation rate higher at the resort hotel than at the city hotel? (Statistically)
  3. Which type of hotel has the most reservations?
  4. Which year has the highest number of bookings?
  5. Distribution of stays in week nights, by number of nights
  6. Distribution of stays in weekend nights, by number of nights
  7. Distribution of lead time for reservations
  8. What is the most common board type in reservations?
  9. Which countries have the most bookings?

1. Is there a relationship between having babies and the cancellation rate? (Statistically)¶

Statistically, the cancellation rate decreases as the number of babies increases: 0.377 with no babies, 0.183 with 1 baby, and 0.13 with 2 babies¶
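Rates like these can be computed by grouping on the number of babies and averaging the cancellation flag. The following is a minimal sketch on toy data, not the notebook's own code; the column names `babies` and `is_canceled` are assumed from the published dataset, and the values are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the bookings table (illustrative values only).
bookings = pd.DataFrame({
    "babies":      [0, 0, 0, 0, 1, 1, 2, 0],
    "is_canceled": [1, 0, 1, 0, 0, 1, 0, 0],
})

# The mean of a 0/1 flag within a group is that group's cancellation rate.
# The group size is kept alongside it to show how many bookings back each rate.
rate_by_babies = (
    bookings.groupby("babies")["is_canceled"]
    .agg(rate="mean", n="size")
)
print(rate_by_babies)
```

Keeping the group size next to the rate is a deliberate choice: a rate computed from only a handful of bookings is much less stable than one computed from thousands.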

2. Is the cancellation rate higher at the resort hotel than at the city hotel? (Statistically)¶

No: the City Hotel has the higher cancellation rate, at about 41%¶

The Resort Hotel's rate is 27%¶
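One way to check whether a gap like 41% versus 27% is statistically meaningful is a two-proportion z-test. The sketch below uses only the standard library; the function name is hypothetical, and the counts are illustrative numbers chosen to reproduce the two rates, not the real booking counts.

```python
from math import erfc, sqrt


def two_proportion_z_test(cancel_a, n_a, cancel_b, n_b):
    """Return (z, two-sided p-value) for H0: both cancellation rates are equal."""
    p_a = cancel_a / n_a
    p_b = cancel_b / n_b
    # Pooled rate under the null hypothesis that both hotels share one rate.
    pooled = (cancel_a + cancel_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Illustrative counts only: 410/1000 (41%) versus 270/1000 (27%).
z, p = two_proportion_z_test(410, 1000, 270, 1000)
print(f"z = {z:.2f}, p = {p:.2e}")
```

With the real data, the four arguments would be the canceled and total bookings for each hotel.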

Which type of hotel has the most reservations?¶

[Figure: count of reservations for each hotel]

The City Hotel has about twice as many reservations as the Resort Hotel¶

Which year has the highest number of bookings?¶

[Figure: count of reservations for every year]

Start and end dates of the booking reservations¶

('1/1/2015', '9/9/2017')
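The tuple above does not match the date range given in the Context section (1 July 2015 to 31 August 2017). One possible explanation, stated here as an assumption because the notebook code is not shown, is that the minimum and maximum were taken over dates stored as text, which compares them character by character rather than chronologically. A minimal sketch with made-up dates in an assumed month/day/year format:

```python
import pandas as pd

# Hypothetical sample of dates stored as month/day/year text.
dates_text = pd.Series(
    ["7/1/2015", "12/31/2016", "1/1/2015", "9/9/2017", "10/5/2017"]
)

# Text comparison works character by character, so "9/9/2017" sorts
# after "10/5/2017" even though it is the earlier date.
text_range = (dates_text.min(), dates_text.max())

# Parsing to datetimes first gives a chronological range.
parsed = pd.to_datetime(dates_text, format="%m/%d/%Y")
date_range = (parsed.min(), parsed.max())

print(text_range)
print(date_range)
```

If the real column uses a different layout, the `format` string would need to change accordingly.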

2016 has the highest number of reservations, but the data does not include the last quarter of 2017, so we cannot say that 2016 had more reservations than 2017¶

Distribution of stays in week nights, by number of nights¶

[Figure: distribution of stays in week nights]

Most stays cover two or three week nights. Some reservations cover more than 10 week nights, but they are few enough to be treated as outliers.¶

Distribution of stays in weekend nights, by number of nights¶

[Figure: count plot of stays_in_weekend_nights]

Most reservations include no weekend nights, or only one or two¶

Distribution of lead time for reservations¶

[Figure: distribution of lead time for reservations]

lead_time has a long-tailed distribution, and most reservations have a lead time between 0 and 50 days¶
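A statement like this can be backed with a few summary numbers: the share of bookings inside the 0 to 50 day window, and a comparison of mean and median to show the right tail. This is a rough sketch on made-up lead times; only the column name `lead_time` comes from the text above.

```python
import pandas as pd

# Made-up lead times in days (illustrative values only).
lead_time = pd.Series([0, 3, 7, 12, 20, 35, 48, 60, 150, 400], name="lead_time")

# Fraction of bookings with a lead time of 0 to 50 days (both ends included).
share_0_50 = lead_time.between(0, 50).mean()

# In a right-skewed distribution the mean sits above the median,
# and the sample skewness is positive.
median_days = lead_time.median()
mean_days = lead_time.mean()
skewness = lead_time.skew()

print(share_0_50, median_days, mean_days, skewness)
```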

What is the most common board type in reservations?¶

What does the board basis mean?¶

SC (Self Catering) : No meals are included; however, your accommodation will be provided with catering facilities for you to cook light meals.

BB (Bed and Breakfast) : Breakfast is included.

HB (Half Board) : Breakfast and evening meals are included. In some cases, you can choose to receive lunch instead of breakfast – the hotel will confirm this on arrival.

FB (Full Board) : Breakfast, lunch and evening meals are included.

[Figure: the most common board type in reservations]

Most reservations request BB (Bed and Breakfast)¶
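The ranking of board types can be read directly from value counts. A small sketch follows, assuming the board basis is stored in a column named `meal` (as in the published dataset) and using made-up values:

```python
import pandas as pd

# Made-up board codes (illustrative values only).
meal = pd.Series(["BB", "BB", "HB", "BB", "SC", "FB", "BB", "HB"], name="meal")

counts = meal.value_counts()                 # bookings per board type
shares = meal.value_counts(normalize=True)   # same, as fractions of all bookings
most_common = counts.idxmax()                # label with the highest count

print(counts)
print(most_common)
```

The same pattern, followed by `.head(10)`, would give a top-10 list for a column with many categories such as the guests' country.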

Which countries have the most bookings?¶

[Figure: the countries with the highest bookings]

Portugal is the country with the most bookings¶

1. Is there a difference in the average cancellation rate between years?¶

[Figure: the difference in the average cancellation of reservations between years]

The cancellation rate increases every year, but only slightly¶

Is there a noticeable change in Average Daily Rate across months?¶

[Figure: Average Daily Rate by arrival_date_month]

Price of room types per night and person¶

This figure shows the average price per room, by room type, together with the standard deviation. Note that, due to data anonymization, rooms with the same type letter may not be the same across hotels¶
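A per-person nightly price is not a column of the raw data, so it has to be derived. One plausible derivation, given here as an assumption rather than the notebook's actual method, divides the average daily rate by the number of guests counted as adults plus children. The column names `adr`, `adults`, `children` and `reserved_room_type` are taken from the published dataset; the values are made up.

```python
import pandas as pd

# Made-up bookings (illustrative values only).
rooms = pd.DataFrame({
    "reserved_room_type": ["A", "A", "A", "D", "D", "D"],
    "adr":      [100.0, 80.0, 120.0, 150.0, 90.0, 140.0],
    "adults":   [2, 1, 2, 2, 0, 2],
    "children": [0, 0, 1, 1, 0, 0],
})

# Drop bookings with no guests so the division below never hits zero.
guests = rooms["adults"] + rooms["children"]
rooms = rooms[guests > 0].copy()

# Assumed definition: price per night and person = adr / (adults + children).
rooms["adr_pp"] = rooms["adr"] / (rooms["adults"] + rooms["children"])

# Mean and standard deviation per room type, as in the figure description.
price_by_type = rooms.groupby("reserved_room_type")["adr_pp"].agg(["mean", "std"])
print(price_by_type)
```

Whether babies should count as guests in the divisor is a judgment call; this sketch leaves them out.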

How does the price per night vary over the year?¶

The price at the resort hotel clearly increases in summer (July, August), while the price at the city hotel increases in April, May, and June¶
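A seasonal comparison like this depends on the months appearing in calendar order, while month names stored as text sort alphabetically. The sketch below shows one way to handle that with an ordered categorical. It uses made-up prices; `arrival_date_month` appears in the plot output above, while `hotel`, `adr`, and the hotel labels are assumed from the published dataset.

```python
import pandas as pd

# Made-up nightly prices (illustrative values only).
prices = pd.DataFrame({
    "hotel": ["Resort Hotel", "Resort Hotel", "Resort Hotel",
              "City Hotel", "City Hotel", "City Hotel"],
    "arrival_date_month": ["August", "January", "August",
                           "May", "January", "May"],
    "adr": [200.0, 50.0, 180.0, 120.0, 80.0, 110.0],
})

month_order = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# An ordered categorical makes grouping and plotting follow the calendar
# instead of the alphabet.
prices["arrival_date_month"] = pd.Categorical(
    prices["arrival_date_month"], categories=month_order, ordered=True
)

# Mean price per month and hotel, one column per hotel.
monthly_adr = (
    prices.groupby(["arrival_date_month", "hotel"], observed=True)["adr"]
    .mean()
    .unstack("hotel")
)
print(monthly_adr)
```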

Where do the guests come from?¶

Most guests are from Portugal, Britain, and France¶

The most important conclusions¶

  1. The City Hotel has about twice as many reservations as the Resort Hotel
  2. 2016 has the highest number of reservations, but the data does not include the last quarter of 2017, so we cannot say that 2016 had more reservations than 2017
  3. The third quarter of the year has the highest rate of bookings, especially the month of August
  4. Most stays cover two or three week nights. Some reservations cover more than 10 week nights, but they are few enough to be treated as outliers
  5. Most reservations include no weekend nights, or only one or two
  6. lead_time has a long-tailed distribution, and most reservations have a lead time between 0 and 50 days
  7. Most reservations request BB (Bed and Breakfast)
  8. Portugal is the country with the most bookings
  9. The cancellation rate increases every year, but only slightly
  10. The average price per room varies by room type. Note that, due to data anonymization, rooms with the same type letter may not be the same across hotels
  11. The price at the resort hotel increases in summer (July, August), while the price at the city hotel increases in April, May, and June

References¶

  1. Stack Overflow
  2. pandas
  3. Kaggle

Created by "Data_Analyst" : Shuaib Alamrity¶
